Decouple DLC dataframe format from split_individuals in to_dlc_file (revives feature from #314, replaces stale PR #529) #883
Open
swagat-mishra28 wants to merge 5 commits into neuroinformatics-unit:main
Hi @niksirbi, I've resolved the linting and mypy issues that were causing the CI failures and pushed the fixes.
Related Issue
Fixes #314
This PR reintroduces the feature originally attempted in PR #529, which was closed after becoming stale. The goal of the issue is to decouple two behaviors that are currently tied together in the split_individuals argument of save_poses.to_dlc_file().
Background
Currently, the split_individuals argument controls two independent behaviors simultaneously:
1. Whether individuals are written to separate files
2. Which DeepLabCut dataframe format is used:
   - Single-animal format (< DLC 2.0)
   - Multi-animal format (>= DLC 2.0)
Because these two behaviors are coupled, users cannot control them independently.
For example, it is currently not possible to save one file per individual while still using the modern multi-animal DLC dataframe format.
Several workflows (including compatibility with tools expecting the DLC >=2.0 format) benefit from being able to control these behaviors independently.
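To make the difference between the two formats concrete, here is a minimal sketch of the column headers involved. It assumes the usual DeepLabCut layout, where single-animal files use three column levels (scorer, bodyparts, coords) and multi-animal files add an individuals level. The helper name dlc_column_header, the scorer name, and the keypoint names are hypothetical and not part of this PR.

```python
from itertools import product


def dlc_column_header(scorer, individuals, bodyparts, dlc_df_format):
    """Return the column tuples of a DLC-style dataframe header.

    Hypothetical helper for illustration only; it is not the code in this PR.
    """
    coords = ["x", "y", "likelihood"]
    if dlc_df_format == "single-animal":
        # 3 levels: scorer, bodyparts, coords (no individuals level)
        return list(product([scorer], bodyparts, coords))
    # 4 levels: scorer, individuals, bodyparts, coords
    return list(product([scorer], individuals, bodyparts, coords))


single = dlc_column_header("movement", ["mouse1"], ["snout", "tail"], "single-animal")
multi = dlc_column_header("movement", ["mouse1"], ["snout", "tail"], "multi-animal")

print(single[0])  # ('movement', 'snout', 'x')
print(multi[0])   # ('movement', 'mouse1', 'snout', 'x')
```

For one individual, both headers describe the same data; the multi-animal header only carries the extra individuals level.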
Proposed Solution
This PR introduces a new argument to explicitly control the DLC dataframe format:
dlc_df_format: Literal["single-animal", "multi-animal"] = "multi-animal"
Updated function signature:
def to_dlc_file(
    ds: xarray.Dataset,
    file_path: str | Path,
    split_individuals: bool = False,
    dlc_df_format: Literal["single-animal", "multi-animal"] = "multi-animal",
) -> None:
    ...
This separates the two previously coupled behaviors:
split_individuals: whether individuals are written to separate files
dlc_df_format: which DLC dataframe format is used
Default Behavior
The default behavior remains fully backward compatible:
split_individuals = False
dlc_df_format = "multi-animal"
This produces a single output file using the modern DLC >=2.0 dataframe format, which is general enough to support both single- and multi-animal datasets.
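The way the two arguments combine can be sketched as a small planning function that lists which files would be written and in which format. This is an illustration under assumptions, not the PR's implementation: plan_dlc_outputs is a hypothetical name, and the per-individual file naming (individual name appended before the extension) is assumed. How a single file holding several individuals behaves with the single-animal format is not stated in this description, so the sketch does not model it.

```python
from pathlib import Path


def plan_dlc_outputs(
    individuals,
    file_path,
    split_individuals=False,
    dlc_df_format="multi-animal",
):
    """List (path, individuals, format) for each file that would be written.

    Hypothetical helper; the file naming scheme is an assumption.
    """
    file_path = Path(file_path)
    if not split_individuals:
        # One file containing every individual
        return [(file_path, list(individuals), dlc_df_format)]
    # One file per individual; the format is chosen independently
    return [
        (
            file_path.with_name(f"{file_path.stem}_{ind}{file_path.suffix}"),
            [ind],
            dlc_df_format,
        )
        for ind in individuals
    ]


# Default: a single file in the multi-animal format
default_plan = plan_dlc_outputs(["id1", "id2"], "poses.h5")

# Previously impossible: one file per individual, multi-animal format
split_plan = plan_dlc_outputs(["id1", "id2"], "poses.h5", split_individuals=True)
```

The point of the sketch is that dlc_df_format is passed through unchanged in both branches, so splitting no longer decides the format.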
Implementation Details
The implementation focuses on a minimal and clean change set:
2. Updated the internal dataframe generation logic in to_dlc_style_df.
4. Preserved the existing dataset validation behavior.
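Because Literal annotations are enforced by static checkers such as mypy but not at runtime, a caller can still pass an unsupported string for dlc_df_format. One possible guard is sketched below; the names DlcDfFormat and check_dlc_df_format are hypothetical, and whether the PR performs a check like this is an assumption.

```python
from typing import Literal, get_args

# Hypothetical alias mirroring the annotation used for dlc_df_format
DlcDfFormat = Literal["single-animal", "multi-animal"]


def check_dlc_df_format(value: str) -> str:
    """Return value if it is a supported format, else raise ValueError."""
    allowed = get_args(DlcDfFormat)  # ('single-animal', 'multi-animal')
    if value not in allowed:
        raise ValueError(
            f"dlc_df_format must be one of {allowed}, got {value!r}"
        )
    return value
```

Deriving the allowed values from the Literal alias keeps the runtime check and the type annotation from drifting apart.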
Tests
Relevant tests in tests/test_unit/test_save_poses.py were updated to reflect the new parameter and to verify its behavior.
Full test suite results:
1027 passed, 4 deselected, 2 xfailed
Coverage remains unchanged:
87% total coverage
Notes
This PR intentionally keeps the scope minimal to focus only on the requested feature.
It supersedes the previous attempt in PR #529 (Splitting_Individuals_and_DLC_Format) while preserving the original design idea.
No breaking API changes were introduced; the new parameter has a sensible default.